[IR Container] Phase 2.4 Per-fusion statement tracking #5961
[IR Container] Phase 2.4 Per-fusion statement tracking #5961mdavis36 wants to merge 2 commits intomd/phase2-shared-ptrfrom
Conversation
|
Review updated until commit 2e491e9 Description
|
| Relevant files | |||||||||
|---|---|---|---|---|---|---|---|---|---|
| Enhancement |
| ||||||||
| Bug fix |
|
PR Reviewer Guide
Here are some key observations to aid the review process:
| 🧪 PR contains tests |
| ⚡ Recommended focus areas for review |
Iterator Invalidation Risk
removeStatementsOwnedBy, the code uses std::erase_if on vals_up_ and exprs_up_ deques while iterating over owned sets. While std::erase_if handles this correctly, the pattern of erasing elements during iteration over a different container (owned) should be validated to ensure no edge cases exist when the same pointer appears in multiple fusion ownership sets. |
3a199c8 to
53e5045
Compare
|
!test |
33629cb to
8b162d9
Compare
53e5045 to
f8ff364
Compare
|
!test |
Greptile SummaryThis PR implements per-Fusion ownership tracking maps ( Key changes and issues found:
Confidence Score: 3/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant FusionA as Fusion A
participant FusionB as Fusion B
participant C as shared IrContainer
participant SG as StatementGuard
FusionA->>C: registerVal(v1) → per_fusion_vals_[A].insert(v1)
FusionA->>C: registerExpr(e1) → per_fusion_exprs_[A].insert(e1)
FusionB->>C: registerVal(v2) → per_fusion_vals_[B].insert(v2)
FusionB->>C: registerExpr(e2) → per_fusion_exprs_[B].insert(e2)
Note over C: vals_up_ = [v1, v2]<br/>exprs_up_ = [e1, e2]
SG->>FusionA: StatementGuard(A) captures numExprs(A)=1, numValsExcludingShortcuts(A)=1
FusionA->>C: registerVal(v3) → per_fusion_vals_[A].insert(v3)
FusionA->>C: registerExpr(e3) → per_fusion_exprs_[A].insert(e3)
Note over C: vals_up_ = [v1, v2, v3]<br/>exprs_up_ = [e1, e2, e3]
SG->>FusionA: ~StatementGuard → removeStatementsCreatedAfter(1, 1)
FusionA->>C: exprs_up_.back()=e3 ✓ owned by A → pop, erase
FusionA->>C: vals_up_.back()=v3 ✓ owned by A → pop, erase
Note over C: Restored: vals_up_=[v1,v2], exprs_up_=[e1,e2]
FusionA->>C: removeStatementsOwnedBy(A) [Fusion::clear()]
C->>C: erase_if vals_up_ where owned[A] → destroys v1
Note over C: ⚠ Val::uses_ on v1 not cleaned up<br/>(safe only if v1 not cross-referenced)
C->>C: erase_if exprs_up_ where owned[A] → destroys e1
Last reviewed commit: 50cb886 |
Additional Comments (2)
In In a shared container scenario where another Fusion has appended statements after the
The same issue applies to the This is safe for single-Fusion containers today, but since this PR's stated goal is enabling shared-container safety, the rollback path should also clean up the correct owner's per-fusion map — e.g., by looking up which Fusion actually owns each statement being removed.
More importantly, For single-Fusion containers this is fine, but these methods should be updated to return per-Fusion counts (or at least documented as container-global) before shared containers are actually used. |
8b162d9 to
b8d202d
Compare
|
!test |
## Summary Review fixes for PR #5961 (Per-Fusion statement tracking): - **O(n²) → O(n)**: Optimize `removeStatementsOwnedBy` with `std::erase_if` - **Per-Fusion counts**: Convert `numExprs()`/`numVals()` to return per-Fusion counts instead of global - **StatementGuard fixes**: Snapshot and compare per-Fusion counts for correct LIFO rollback in shared containers - **LIFO assertions**: Verify tail elements belong to this Fusion before popping ## Tests All tests pass: - ✅ StatementGuardTest.ExecuteAfterGuard - ✅ StatementGuardTest.LazySpecialValsNotDangling - ✅ FusionCopy_CUDA - ✅ FusionMove_CUDA
c965408 to
5339bf5
Compare
## Summary Review fixes for PR #5961 (Per-Fusion statement tracking): - **O(n²) → O(n)**: Optimize `removeStatementsOwnedBy` with `std::erase_if` - **Per-Fusion counts**: Convert `numExprs()`/`numVals()` to return per-Fusion counts instead of global - **StatementGuard fixes**: Snapshot and compare per-Fusion counts for correct LIFO rollback in shared containers - **LIFO assertions**: Verify tail elements belong to this Fusion before popping ## Tests All tests pass: - ✅ StatementGuardTest.ExecuteAfterGuard - ✅ StatementGuardTest.LazySpecialValsNotDangling - ✅ FusionCopy_CUDA - ✅ FusionMove_CUDA
bc595c5 to
9f944b4
Compare
Additional Comments (2)
The This is actively triggered by The same pattern repeats for The fix is to extract the source set before any mutation of the map: auto vals_it = per_fusion_vals_.find(from);
if (vals_it != per_fusion_vals_.end()) {
// Copy the source set before operator[] can rehash and invalidate vals_it.
auto from_vals = std::move(vals_it->second);
per_fusion_vals_.erase(vals_it);
per_fusion_vals_[to].insert(from_vals.begin(), from_vals.end());
}
auto exprs_it = per_fusion_exprs_.find(from);
if (exprs_it != per_fusion_exprs_.end()) {
auto from_exprs = std::move(exprs_it->second);
per_fusion_exprs_.erase(exprs_it);
per_fusion_exprs_[to].insert(from_exprs.begin(), from_exprs.end());
}
The expr-removal loop unconditionally pops from For a single-Fusion container this holds, but once two Fusions share a container (the stated goal of Phase 2), another Fusion may register statements into the shared deque after this Effectively, One possible approach for future-proofing: instead of a deque tail pop, iterate |
5339bf5 to
24e4f69
Compare
## Summary Review fixes for PR #5961 (Per-Fusion statement tracking): - **O(n²) → O(n)**: Optimize `removeStatementsOwnedBy` with `std::erase_if` - **Per-Fusion counts**: Convert `numExprs()`/`numVals()` to return per-Fusion counts instead of global - **StatementGuard fixes**: Snapshot and compare per-Fusion counts for correct LIFO rollback in shared containers - **LIFO assertions**: Verify tail elements belong to this Fusion before popping ## Tests All tests pass: - ✅ StatementGuardTest.ExecuteAfterGuard - ✅ StatementGuardTest.LazySpecialValsNotDangling - ✅ FusionCopy_CUDA - ✅ FusionMove_CUDA
9f944b4 to
7b3ce1f
Compare
Additional Comments (3)
The same hazard applies in the A safe alternative is to extract the node first, which is both iterator-safe and avoids copying the entire set: void IrContainer::transferStatementOwnership(
const Fusion* from,
const Fusion* to) {
auto vals_node = per_fusion_vals_.extract(from);
if (vals_node) {
auto& to_vals = per_fusion_vals_[to];
to_vals.insert(
std::make_move_iterator(vals_node.mapped().begin()),
std::make_move_iterator(vals_node.mapped().end()));
}
auto exprs_node = per_fusion_exprs_.extract(from);
if (exprs_node) {
auto& to_exprs = per_fusion_exprs_[to];
to_exprs.insert(
std::make_move_iterator(exprs_node.mapped().begin()),
std::make_move_iterator(exprs_node.mapped().end()));
}
}
For single-Fusion containers this is currently safe because every owned Removing exprs first (before destroying vals) and calling
The same pattern appears in the val loop on line 448 ( Prefer a read-only lookup to avoid the accidental mutation: |
24e4f69 to
fab1748
Compare
## Summary Review fixes for PR #5961 (Per-Fusion statement tracking): - **O(n²) → O(n)**: Optimize `removeStatementsOwnedBy` with `std::erase_if` - **Per-Fusion counts**: Convert `numExprs()`/`numVals()` to return per-Fusion counts instead of global - **StatementGuard fixes**: Snapshot and compare per-Fusion counts for correct LIFO rollback in shared containers - **LIFO assertions**: Verify tail elements belong to this Fusion before popping ## Tests All tests pass: - ✅ StatementGuardTest.ExecuteAfterGuard - ✅ StatementGuardTest.LazySpecialValsNotDangling - ✅ FusionCopy_CUDA - ✅ FusionMove_CUDA
7b3ce1f to
2e491e9
Compare
Additional Comments (3)
Safe fix — extract the Alternatively,
For a shared container this is a latent bug: if a Val owned by Fusion B has Fusion A's Expr in its For the current single-Fusion use case all Vals are destroyed in the same call, so the stale
Both loops pop from the global Consider: The |
These are pretty much all addressed by changes in later chained PRs. They latter two are specifically addressed in #5983 as this came back to bite me w/ parallel compilation for shared containers across segments. |
fab1748 to
dcc99d7
Compare
Add per_fusion_vals_ / per_fusion_exprs_ maps to IrContainer so each Fusion can efficiently query only its own statements in a shared container. Fusion forwarding methods (vals(), unordered_exprs(), deterministic_vals(), etc.) now return per-Fusion filtered results. Fusion::clear() uses removeStatementsOwnedBy(this) instead of ir_container()->clear().
## Summary Review fixes for PR #5961 (Per-Fusion statement tracking): - **O(n²) → O(n)**: Optimize `removeStatementsOwnedBy` with `std::erase_if` - **Per-Fusion counts**: Convert `numExprs()`/`numVals()` to return per-Fusion counts instead of global - **StatementGuard fixes**: Snapshot and compare per-Fusion counts for correct LIFO rollback in shared containers - **LIFO assertions**: Verify tail elements belong to this Fusion before popping ## Tests All tests pass: - ✅ StatementGuardTest.ExecuteAfterGuard - ✅ StatementGuardTest.LazySpecialValsNotDangling - ✅ FusionCopy_CUDA - ✅ FusionMove_CUDA
2e491e9 to
50cb886
Compare
Additional Comments (3)
When removing expressions via // removeStatementsCreatedAfter (correct):
for (Val* in : e->inputs()) {
in->removeUse(e);
}For the current single-Fusion case ( Consider adding the same cleanup here to make the invariant hold regardless of sharing: std::erase_if(exprs_up_, [&](const std::unique_ptr<Expr>& e) {
if (owned.count(e.get()) > 0) {
for (Val* in : e->inputs()) {
in->removeUse(e.get());
}
exprs_.erase(e.get());
return true;
}
return false;
});
Expr* e = c->exprs_up_.back().get();
NVF_ERROR(
c->per_fusion_exprs_[this].count(e) > 0,
"removeStatementsCreatedAfter: tail expr belongs to another Fusion");This is safe today (single-Fusion containers) but fails in the target Phase 2 multi-Fusion scenario. Consider Fusion A and B sharing a container where When A's guard destructs, it checks The
The function subtracts only the 5 named shortcut fields ( If a typed shortcut val (e.g., The existing |
|
!test |
Summary
Add per-Fusion ownership tracking maps to IrContainer so each Fusion can efficiently query only its own Vals and Exprs within a shared container. Update all Fusion-level accessors to filter by ownership.
This is the highest-risk change in the Phase 2 chain. Every accessor call path is touched —
vals(),deterministic_vals(),deterministic_exprs(),unordered_exprs(), and more now return ownership-filtered results rather than raw container contents. For single-Fusion containers the results are identical, but the implementation changes underneath every consumer.Relationship to Phase 2
This is the critical invariant that makes shared containers safe:
Without per-Fusion filtering, a shared container copy would break every consumer —
Fusion::copywould iterate all vals (including other Fusions'),deterministic_vals()would return interleaved results, andStatementGuardrollback would destroy statements belonging to other Fusions.This invariant is what allows Phase 2 to maintain independent IR graphs despite shared storage:
fusion->vals()returns{v : v in container AND v->container() == fusion}Fusion::clear()only clears THIS Fusion's state (notir_container()->clear())CI Risk
Highest of all Phase 2 PRs. Every accessor-dependent code path is touched. For single-Fusion containers, filtered results are identical to unfiltered — but regressions would surface in accessor-heavy code paths (scheduling, lowering, codegen).